Type 2 diabetes (T2D) disproportionally affects African Americans
(AfA) but, to date, genetic variants identified from genome-wide
association studies (GWAS) are primarily from European
and Asian populations. We examined the single nucleotide 
polymorphism (SNP) and locus transferability of 40 reported 
T2D loci in six AfA GWAS consisting of 2,806 T2D case subjects 
with or without end-stage renal disease and 4,265 control 
subjects from the Candidate Gene Association Resource Plus 
Study. Our results revealed that seven index SNPs at the TCF7L2, 
KLF14, KCNQ1, ADCY5, CDKAL1, JAZF1, and GCKR loci were 
significantly associated with T2D (P < 0.05). The strongest association
was observed at TCF7L2 rs7903146 (odds ratio [OR] 1.30;
P = 6.86 × 10⁻⁸). Locus-wide analysis demonstrated significant
associations (P_emp < 0.05) at regional best SNPs in the TCF7L2,
KLF14, and HMGA2 loci as well as suggestive signals in KCNQ1 
after correction for the effective number of SNPs at each locus. 
Of these loci, the regional best SNPs were in differential linkage 
disequilibrium (LD) with the index and adjacent SNPs. Our find- 
ings suggest that some loci discovered in prior reports affect T2D 
susceptibility in AfA with similar effect sizes. The reduced and 
differential LD pattern in AfA compared with European and Asian 
populations may facilitate fine mapping of causal variants at loci 
shared across populations. Diabetes 62:965-976, 2013 




Type 2 diabetes (T2D) is a major public health 
problem affecting 25.8 million people in the U.S. 
(1). Marked racial differences in its prevalence 
have been observed, with African American 
(AfA) adults >40 years of age having nearly twofold higher 
prevalence than European Americans (27.1 and 15.5%, 
respectively) (2). In addition to socioeconomic and be- 
havioral risk factors, genetic factors are likely contributors 
to T2D risk in AfA (3). 

Genome-wide association studies (GWAS) for T2D and 
related traits have successfully identified >50 loci with 
common genetic variants associated with T2D risk in pri- 
marily European-descent populations (4-14) and more 
recently in East and South Asians (15-21). The reported 
index single nucleotide polymorphisms (SNPs) at these 
loci have been replicated in multiple populations (22-24) 
but less successfully in AfA (25-27). Although differences 
in environment and lack of study power may partly ac- 
count for the lack of transferability across ethnicities, 
differences in linkage disequilibrium (LD) patterns, effect 
sizes, and risk allele frequency also likely impact the rep- 
lication of index SNPs. Although the long-range LD in 
European populations allows for the identification of T2D 
loci using less dense markers, causal variants are not 
distinguishable from other nearby SNPs in high LD. This 
issue prompts the need to examine T2D loci in other 
populations with different allelic and LD architecture, 
which may help fine mapping of the underlying functional 
variants (28). 

We performed a comprehensive evaluation of the LD 
region of T2D loci reported in European and Asian GWAS 
in a meta-analysis of six AfA GWAS. By testing the index 
and nearby SNPs, we evaluated the transferability of the 
previously reported loci for T2D association in AfA. We 
demonstrated that the reduced and differential LD structure
in AfA facilitated fine mapping of regions potentially
harboring causal variants at some T2D loci. 

RESEARCH DESIGN AND METHODS 

Subjects. The study samples include AfA from the National Heart, Lung, and 
Blood Institute's (NHLBI's) Candidate Gene Association Resource (CARe) and 
the Wake Forest School of Medicine (WFSM) study. CARe is an NHLBI shared 
resource comprised of five cohorts with multiple phenotypes for GWAS in 
AfA. The study design of CARe has been described in detail elsewhere (26). 
The five CARe cohorts are as follows: Atherosclerosis Risk in Communities 
(ARIC), Coronary Artery Risk Development in Young Adults (CARDIA), 
Cleveland Family Study (CFS), Jackson Heart Study (JHS), and Multi-Ethnic 
Study of Atherosclerosis (MESA). Details of the study cohorts are described in 
the Supplementary Data. Written informed consent was obtained from all 
study participants. Recruitment and sample collection procedures were ap- 
proved by the institutional review board from the respective institutions. The 
clinical characteristics of all cohorts are summarized in Table 1. 
Clinical definitions. T2D was diagnosed according to the American Diabetes 
Association criteria (29) with at least one of the following: fasting glucose
≥126 mg/dL, 2-h oral glucose tolerance test glucose ≥200 mg/dL, random
glucose ≥200 mg/dL, use of oral hypoglycemic agents and/or insulin, or
physician-diagnosed diabetes. Subjects diagnosed before 25 years of age were excluded.
Normal glucose tolerance was defined as fasting glucose <100 mg/dL and
2-h oral glucose tolerance test glucose <140 mg/dL (if available) without 
reported use of diabetes medications. Control subjects <25 years of age were 
excluded. 
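
The case and control definitions above amount to a small rule set. The following is a minimal Python sketch of those rules, not the study's own code: the `Subject` record and its field names are hypothetical, glucose values are in mg/dL, the diagnostic cutoffs are assumed to be inclusive as in the American Diabetes Association criteria, and subjects who meet neither definition are assumed to be left out of the analysis.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Subject:
    """Hypothetical per-subject record; glucose values are in mg/dL."""
    age: float
    fasting_glucose: Optional[float] = None
    ogtt_2h_glucose: Optional[float] = None
    random_glucose: Optional[float] = None
    on_diabetes_medication: bool = False
    physician_diagnosed: bool = False
    age_at_diagnosis: Optional[float] = None


def at_least(value: Optional[float], cutoff: float) -> bool:
    """True when a measurement exists and meets the cutoff (assumed inclusive)."""
    return value is not None and value >= cutoff


def classify(s: Subject) -> str:
    """Return 'case', 'control', or 'excluded' following the stated definitions."""
    has_t2d = (
        at_least(s.fasting_glucose, 126)
        or at_least(s.ogtt_2h_glucose, 200)
        or at_least(s.random_glucose, 200)
        or s.on_diabetes_medication
        or s.physician_diagnosed
    )
    if has_t2d:
        # Subjects diagnosed before 25 years of age are excluded.
        if s.age_at_diagnosis is not None and s.age_at_diagnosis < 25:
            return "excluded"
        return "case"
    # Controls need fasting glucose <100 and, if measured, 2-h glucose <140.
    normal_fasting = s.fasting_glucose is not None and s.fasting_glucose < 100
    normal_ogtt = s.ogtt_2h_glucose is None or s.ogtt_2h_glucose < 140
    # Control subjects younger than 25 years are excluded.
    if normal_fasting and normal_ogtt and s.age >= 25:
        return "control"
    # Assumption: anyone meeting neither definition is not analyzed.
    return "excluded"
```

Under these assumptions, a subject with fasting glucose between 100 and 125 mg/dL and no other criterion is neither a case nor a control.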

Genotyping, imputation, and quality control. Genotyping was performed 
using the Affymetrix Genome-Wide Human SNP Array 6.0 in all samples. For the
CARe study, genotyping, quality control, and data analyses were performed 
centrally by the CARe analytical group at the Broad Institute, and details are 
described elsewhere (26). For the WFSM study, genotyping was performed at 
the Center for Inherited Disease Research (CIDR), and analyses were per- 
formed at WFSM and described elsewhere (30,31). For all studies, imputation 
was performed using MACH with the function -mle (version 1.0.16,
http://www.sph.umich.edu/csg/abecasis/MaCH/) to obtain missing genotypes and
replace genotypes inconsistent with reference haplotypes. In general, SNPs 
with call rate >95% and minor allele frequency (MAF) >1% that passed study- 
specific quality control were used for imputation (26,32). A 1:1 HapMap II 
(NCBI Build 36) CEU:YRI (European:African) consensus haplotype was used
as reference. Imputation was performed in two steps. The first step selected 
a random subset of unrelated samples to calculate recombination and error 
rate estimates. The second step used these rates to impute all samples across 
the SNPs in the entire reference panel. A total of 2,333,531-2,907,112 SNPs 
from each study with call rate >95%, MAF >1%, minor allele count (MAC) 
>10, and Hardy-Weinberg P value >0.0001 for genotyped SNPs and MAF >1%,
MAC >10, and RSQ >0.5 for imputed SNPs were included in subsequent data 
analyses. 
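
The inclusion thresholds listed above can be expressed as a single filter. This is an illustrative Python sketch only; the dictionary layout and key names are hypothetical, and the cutoffs are taken as strict inequalities exactly as printed.

```python
def passes_qc(snp: dict) -> bool:
    """Apply the stated post-imputation filters to one SNP record.

    `snp` is a hypothetical dict with keys 'imputed' (bool), 'maf', and
    'mac' (minor allele count); genotyped SNPs also carry 'call_rate' and
    'hwe_p', and imputed SNPs carry 'rsq' (imputation quality).
    """
    # Filters shared by genotyped and imputed SNPs: MAF >1% and MAC >10.
    if snp["maf"] <= 0.01 or snp["mac"] <= 10:
        return False
    if snp["imputed"]:
        # Imputed SNPs additionally need RSQ >0.5.
        return snp["rsq"] > 0.5
    # Genotyped SNPs need call rate >95% and Hardy-Weinberg P >0.0001.
    return snp["call_rate"] > 0.95 and snp["hwe_p"] > 0.0001
```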

Principal component analysis. Principal components (PCs) were computed in
each study by using high-quality SNPs (26,32,33). The first PC was highly
correlated (r² >0.87) with global African-European ancestry, as measured by
ANCESTRYMAP (34), STRUCTURE (35), or FRAPPE (36). The AfA samples 
had an average of 80% African ancestry. By analyzing unrelated samples from 
all studies using SMARTPCA (33), only the first PC appeared to account for 
substantial genetic variation in the scree plot (data not shown), whereas the
subsequent PCs may reflect sampling noise and/or relatedness in samples (34). 
The first PC was used as a covariate in the association analyses to adjust for 
population substructure. 
Statistical analyses 

Single SNP association and meta-analysis. In each study, the association 
of genotyped and imputed (in dosage) SNPs with T2D was assessed under an 
additive model with adjustment for age, sex, study center (if applicable), and 
the first PC. Age at the last visit with other clinical parameters available for 
prospective studies (ARIC, CARDIA, CFS, and MESA), or at baseline for JHS 
and WFSM studies, was analyzed. Association tests were performed using 
logistic regression in PLINK (http://pngu.mgh.harvard.edu/purcell/plink) in 
unrelated samples and generalized estimating equation in GWAF in R (v2.9.0) 
(37) in related (CFS) samples. Association results with extreme values (absolute
β coefficient or SE >10), primarily due to low cell counts from small
sample size and low MAF, were excluded. After genomic control correction 
within each study, association results were combined by fixed-effect inverse 
variance weighting implemented in METAL (http://www.sph.umich.edu/csg/abecasis/metal/).
Results from SNPs with <50% samples analyzed and those
with allele frequency difference >0.3 among studies were excluded. A total of 
2,739,003 SNPs were analyzed in the meta-analysis. The mean SNP call rate 
was 99% in the locus-specific meta-analysis. 
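
The combination step described above (genomic control within each study, then fixed-effect inverse variance weighting) can be sketched in a few lines of Python. This is a generic illustration of the technique rather than METAL's code; the function names are hypothetical, and leaving inflation factors below 1 uncorrected is an assumption.

```python
import math


def genomic_control_se(se: float, lam: float) -> float:
    """Inflate a standard error by sqrt(lambda).

    Assumption: inflation factors below 1 are left uncorrected.
    """
    return se * math.sqrt(max(lam, 1.0))


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect estimate and its standard error."""
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def two_sided_p(beta: float, se: float) -> float:
    """Two-sided P value from a normal approximation to beta / se."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))
```

In use, each study's standard error would first be passed through `genomic_control_se` with that study's inflation factor, and the per-study log odds ratios and corrected standard errors would then be given to `fixed_effect_meta`.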

TABLE 2

Association of reported T2D index SNPs in AfA

Chr, chromosome. *ZBED3 rs4457053 and HNF1B rs4430796 failed quality control and results were not reported. †Risk allele (RA) and
nonrisk allele (NRA) as reported previously in European or East Asian populations. Alleles were indexed to the forward strand of NCBI Build
36. ‡Risk allele frequency (RAF) in AfA from this study. §Significant associations (P < 0.05) are set in boldface type. ||Heterogeneity P values
across studies for index SNPs.



Locus-specific analysis. We used two approaches to test for the trans- 
ferability of 40 T2D loci in AfA. First, the most significant independent index 
SNPs (P < 5 × 10⁻⁸) from T2D loci identified through GWAS of T2D and
related glucose homeostasis traits were selected from the catalog of pub- 
lished GWAS at the National Human Genome Research Institute until De- 
cember 2010 (http://www.genome.gov/gwastudies) (38) (Supplementary 
Table 1). At the CDC123-CAMK1D and KCNQ1 loci, independent T2D in- 
dex SNPs had been identified in European and East Asian populations, 
respectively (10,12,16,19). At the C2CD4A-C2CD4B locus, rs7172432 was 
associated with T2D in East Asians (17). A nearby independent index SNP 
rs11071657 showing strong association with fasting glucose but modest
association with T2D in Europeans (13) was also examined. Additionally, 
two index SNPs from PPARG and HNF1B that did not reach genome- 
wide significance were selected due to candidacy and consistent replica- 
tion. Second, the region of interest for each locus was defined as the 
boundary of the farthest SNPs that show LD at r² >0.3 with the index SNP
in CEU or JPT+CHB (Asian [ASN]) populations and further extended by
100 kb. These regions will likely harbor causal variants that are in LD with
the index SNPs reported by the original GWAS. This approach takes into 
account varying LD block size across the genome and absence of corre- 
lated SNPs for some index SNPs in HapMap. Regional pairwise LD was 
calculated in SNAP (http://www.broadinstitute.org/mpg/snap/) using the 
HapMap II release 22 CEU and ASN data for loci reported in Europeans and 
Asians, respectively. The regions of interest range from 200 to 807 kb 
(Supplementary Table 1), and the effective number of SNPs range from 45 
to 156. 
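
The region definition can be illustrated with a short Python sketch. It assumes the 100-kb extension is applied on both sides of the outermost correlated SNPs, which is consistent with the 200-kb minimum region size reported above for an index SNP with no correlated SNPs; the function name and input layout are hypothetical.

```python
FLANK_BP = 100_000   # assumed extension on each side of the outermost SNPs
R2_THRESHOLD = 0.3   # LD cutoff in the reference (CEU or ASN) population


def region_of_interest(index_pos: int, ld_pairs):
    """Return (start, end) in base pairs for one locus.

    `ld_pairs` is an iterable of (position, r2) tuples giving each SNP's
    LD with the index SNP in the reference population.
    """
    # The index SNP itself always anchors the region.
    positions = [index_pos]
    positions += [pos for pos, r2 in ld_pairs if r2 > R2_THRESHOLD]
    start = max(min(positions) - FLANK_BP, 0)
    end = max(positions) + FLANK_BP
    return start, end
```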

In the first approach, SNPs were examined for transferability by directly 
testing the reported index SNP for T2D association. SNP-specific significance 
was considered as P < 0.05 in the same direction of association in prior 
reports. In the second approach, locus transferability was assessed by testing 
all SNPs in the region of interest. In each locus, the most significant SNP was 
defined as the best SNP. The effective number of SNPs (independent SNPs) 
was estimated from the eigenvalues of the covariance matrix of the SNPs in 
each locus using the Li and Ji method implemented in SOLAR (39). Empirical 
locus-specific P values, P_emp, were adjusted for multiple comparisons by
Bonferroni correction for the effective number of SNPs. 
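
A minimal Python sketch of this correction follows. It takes precomputed eigenvalues as input and applies the Li and Ji counting rule, in which each eigenvalue x contributes I(x ≥ 1) + (x − floor(x)); it is not the SOLAR implementation, and the function names are hypothetical. The sketch assumes the eigenvalues come from the SNP correlation matrix.

```python
import math


def li_ji_effective_number(eigenvalues) -> float:
    """Effective number of independent SNPs from matrix eigenvalues.

    Each eigenvalue x contributes I(x >= 1) + (x - floor(x)), following
    the Li and Ji rule; absolute values are used.
    """
    m_eff = 0.0
    for value in eigenvalues:
        x = abs(value)
        m_eff += (1.0 if x >= 1.0 else 0.0) + (x - math.floor(x))
    return m_eff


def empirical_p(p: float, m_eff: float) -> float:
    """Bonferroni-adjusted P value for m_eff independent tests, capped at 1."""
    return min(1.0, p * m_eff)
```

Three uncorrelated SNPs (eigenvalues 1, 1, 1) count as three tests, whereas three perfectly correlated SNPs (eigenvalues 3, 0, 0) count as one.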
Conditional and haplotype analyses. For loci showing significant regional 
associations, logistic regression was performed conditioned on both the index 
and best SNPs to reveal the presence of independent or residual associ- 
ations. Haplotype analysis of the index and best SNPs was also performed by 
a haplotype-specific test using --chap in PLINK to compare differences of
frequencies of each haplotype with a reference haplotype between cases and 
controls. Best-guess genotypes were analyzed for imputed SNPs. All analyses 
were performed separately in each study, adjusted for age, sex, study center, 
and the first PC. The conditional analyses also adjusted for inflation factor from 
GWAS in each study. Effect sizes were then combined by meta-analysis. 
Population differentiation and natural selection. Four methods were 
applied to evaluate whether the differences in genetic architecture between the 
ancestries of AfA or between the discovery populations and AfA account for the
differential association signals for AfA in this study. For the index SNPs, the 
absolute difference of risk allele frequency was assessed between AfA and CEU 
(or ASN) populations for loci identified in Europeans and East Asians, respectively.
We also assessed two metrics using Haplotter: F_ST for measurement
of population differentiation, and integrated haplotype score for the
detection of recent positive selection in the CEU (or ASN) and YRI populations
(40). To assess interpopulation differences in LD patterns, the
varLD method was used to assess genome-wide distribution of varLD scores 
between CEU (or ASN) and YRI (41). The varLD scores were standardized, 
and the 100-kb regions flanking the index T2D SNPs were examined. A stan- 
dardized varLD score exceeding the 95th percentile of the distribution was 
considered a significant LD difference between the studied populations. 
Power analysis. Posterior study power was calculated using the genetic 
power calculator (42) under an additive model, using the SNP-specific effec- 
tive sample size (43) of this study and reported effect sizes from the replica- 
tion phases (wherever available) or all phases in prior T2D reports to minimize 
winner's curse effect. 
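
For orientation, one commonly used definition of the effective sample size of a case-control study is 4 / (1/N_cases + 1/N_controls), summed across studies. The Python sketch below implements that definition; whether it matches the exact formula of reference 43, and whether the study summed per-cohort values in this way, are assumptions.

```python
def effective_sample_size(n_cases: int, n_controls: int) -> float:
    """Effective sample size of one study: 4 / (1/cases + 1/controls)."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


def total_effective_sample_size(studies) -> float:
    """Sum per-study effective sizes; `studies` holds (n_cases, n_controls) pairs."""
    return sum(effective_sample_size(ca, co) for ca, co in studies)
```

Under this definition a balanced study contributes its full sample size, and an unbalanced one contributes less.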

All statistical tests were performed by PLINK, GWAF, or SAS v.9.1 (SAS 
Institute, Cary, NC), unless otherwise specified. A nominal P value <0.05 for 
index SNPs was considered significant. A Bonferroni P value (P_emp) <0.05
corrected for the effective number of SNPs was considered significant for 
regional SNPs. 

RESULTS 

Clinical characteristics of the study samples. Clinical 
characteristics of the six GWAS cohorts are shown in Ta- 
ble 1. A total of 2,806 T2D case subjects and 4,265 control 
subjects (6,701 effective sample size) were included. The 
mean age at diagnosis of T2D in case subjects varied from 
35.0 to 54.6 years among studies. 

GWAS and meta-analysis. A total of 2.3-2.9 million SNPs 
that passed quality control were tested for association with 
T2D in each cohort separately. Inflation factors for the 
associations were 1.022 for ARIC, 1.020 for CARDIA, 1.084 
for CFS, 1.079 for JHS, 1.009 for MESA, and 1.054 for 
WFSM cohorts before genomic control. The inflation fac- 
tor for the meta-analysis result was 1.027 after genomic 
control in 2,739,003 SNPs. Results from T2D candidate loci 
were selected for subsequent analyses. No correlation was 
observed between association results and F_ST with first PC
adjustment. In addition, the inflation factor and association 
results with adjustment for the first 10 PCs are similar 
(data not shown), suggesting that adjustment for the first 
PC is sufficient to control for population substructure. 
Association analyses of index SNPs. The association of 
43 independent index SNPs from 40 T2D loci identified 
from GWAS of European and East Asian ancestries is 
shown in Table 2. No significant heterogeneity of associ- 
ations was observed after Bonferroni correction of multi- 
ple comparisons despite heterogeneous study designs. 
Among 41 good-quality SNPs, 23 showed directionally 
consistent association, as in previous reports (binomial 
test, P = 0.27), and seven were significantly associated 
with T2D (Supplementary Fig. 1). The strongest associa- 
tion was observed at TCF7L2 rs7903146 (odds ratio [OR] 
1.30 [95% CI 1.18-1.43]; P = 6.86 × 10⁻⁸), followed by
KLF14, KCNQ1, ADCY5, CDKAL1, JAZF1, and GCKR
(OR 1.10-1.25; 8.12 × 10⁻⁴ < P < 0.05). At KCNQ1, the
association at the index SNP rs2237892 identified in East
Asians (16) was significant and had stronger effect size
(OR 1.25 [95% CI 1.09-1.43]; P = 0.0018) than the index
SNP rs231362 identified in European populations (12)
(1.07 [0.95-1.20]; P = 0.25). Nominal associations were also
observed at the index SNPs in HMGA2 and ZFAND6 (0.011
< P < 0.05), but the reported risk alleles in European
populations (12) were protective for T2D in AfA. The
BMI-associated index SNP rs8050136 was not associated
with T2D with (P = 0.839) or without (P = 0.852) BMI
adjustment.

FIG. 1. Association plots and LD patterns at the regions flanking the index SNPs at TCF7L2 (A and B), KLF14 (C and D), HMGA2 (E and F),
NOTCH2-ADAM30 (G and H), and KCNQ1 (I-L). At the top panel of each plot, the x-axis denotes genomic position and the y-axis denotes the
−log(P value) for the association of each SNP in AfA. Each locus contains two plots. The plots on the left denote the location of the index SNPs
(blue arrows) and the color of each data point represents its LD value (r²) with the index SNPs in the HapMap II CEU or JPT+CHB (ASN)
populations, for loci identified in Europeans and East Asians, respectively. The blue line represents the recombination rate in the respective
HapMap populations. The LD plots (D' and r²) in the respective HapMap populations are shown in the bottom panel. The plots on the right
denote the location of the best SNPs (red arrows), and the color of each data point represents its LD value (r²) with the best SNPs in our AfA
samples. The blue line represents the recombination rate in the HapMap YRI population. The LD plot (D' and r²) for our AfA samples is shown
at the bottom panel.
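
The binomial test on directional consistency reported in this section (23 of 41 SNPs, P = 0.27) can be reproduced with a short calculation. The Python sketch below assumes a one-sided test against a null probability of 0.5; the source does not state which form of the test was used, and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """One-sided P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1.0 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


p_value = binomial_upper_tail(23, 41)
print(f"{p_value:.2f}")  # prints 0.27
```

Under these assumptions the tail probability is about 0.266, which rounds to the reported value.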

Best SNPs in regional association analyses. The
regions defined by boundary SNPs in moderate LD (r²
>0.3) to the index SNPs were evaluated further. By defining
the best SNP as the most significant SNP in each
region, four of the significant index SNPs (rs7903146 at
TCF7L2, rs11708067 at ADCY5, rs2237892 at KCNQ1, and
rs11634397 at ZFAND6) were also the best SNPs in the
respective regions (Tables 2 and 3). After correction for
multiple comparisons among the effective number of SNPs
in each region, the association signal among the best SNPs
at four loci, TCF7L2, KLF14, HMGA2, and NOTCH2-ADAM30,
remained significant (4.46 × 10⁻⁶ < P_emp <
0.05) (Table 3).

TCF7L2. The most significant best SNP was rs7903146
located at intron 3 of TCF7L2 (OR 1.30; P = 6.86 × 10⁻⁸;
P_emp = 4.46 × 10⁻⁶), which was also the index SNP
reported in European and East Asian populations (12,44).
Although rs7903146 was in strong LD (r² >0.8) with several
nearby SNPs in a 64-kb LD region in CEU (Fig. 1A), it
was located in a 9-kb LD block in AfA (Fig. 1B). No SNPs
were in strong LD with rs7903146 in AfA (r² <0.6) or YRI
(r² <0.4). The second strongest SNP was rs7069007 (OR
1.40 [95% CI 1.23-1.58]; P = 1.63 × 10⁻⁷; P_emp = 1.1 × 10⁻⁵).
This SNP was no longer significant after adjustment for
rs7903146 (P = 0.21), suggesting that rs7903146 represented
the sole association signal in this region.
KLF14. At the KLF14 locus, the best SNP rs13234269 (OR
1.26 [95% CI 1.14-1.40]; P = 1.55 × 10⁻⁵; P_emp = 0.001) was
located at 5' of KLF14, 38 kb from the index SNP
rs972283. The best and index SNPs resided in two adjacent
LD blocks and were perfectly correlated in CEU (r² = 1)
(Fig. 1C) but showed weak or no correlations in AfA (r² =
0.36) (Fig. 1D) and YRI (r² = 0.01) (Table 3). The association
was weaker for the index SNP rs972283 (OR 1.24; P =
8.12 × 10⁻⁴) (Table 2) as compared with the best SNP, and
was no longer significant after conditioning on the best
SNP rs13234269 (P = 0.620). Haplotype analysis of the best
and index SNPs revealed three common haplotypes (AA,
AG, and TG). As compared with the AA haplotype, the TG
haplotype formed by the risk T allele at rs13234269 and the
risk G allele at rs972283 was associated with increased risk
for T2D (OR 1.22; P = 0.002), but the AG haplotype did not
show significant association (OR 0.99; P = 0.898). The TG
haplotype was also associated with increased T2D risk
when compared with the AG haplotype (P = 0.002)
(Table 4).

HMGA2. At the HMGA2 locus, the best SNP rs12049974
(T allele: OR 1.24 [95% CI 1.14-1.35]; P = 1.73 × 10⁻⁶;
P_emp = 1.36 × 10⁻⁴) and the index SNP rs1531343, which
were 100 kb apart, were located at a region of high LD near
the 5' end of HMGA2. They were uncorrelated in CEU (r² =
0.01) (Fig. 1E) but moderately correlated in AfA (r² = 0.43)
(Fig. 1F) and YRI (r² = 0.67) (Table 3). The association at
the best SNP remained significant after adjustment for the
effect of the index SNP (P = 5.07 × 10⁻⁶). The reported-risk
C allele at the index SNP showed a trend of association with
increased T2D risk after conditioning on the best SNP (OR
1.14; P = 0.051), in contrast to the protective effect in the
unconditional analysis (OR 0.91; P = 0.022) (Table 2). The
associations at other nearby SNPs were also substantially
weaker (P > 0.001) after conditioning on the best SNP.
Three common haplotypes (AC, TG, and AG) were observed
using the best and the index SNPs. With reference to
the AC haplotype, the TG haplotype was associated with
increased T2D risk (OR 1.16 [1.06-1.27]; P = 0.002), whereas
the AG haplotype was associated with decreased T2D risk
(OR 0.86 [0.76-0.98]; P = 0.024) (Table 4).
NOTCH2-ADAM30. The best SNP at the NOTCH2/ADAM30
locus was rs12075171 (OR 1.34 [95% CI 1.15-1.55];
P = 1.25 × 10⁻⁴; P_emp = 0.011) located at the 5' end of
the nearby gene, REG4. The best and index SNPs were 
located in discrete LD blocks and were uncorrelated with 
each other (Fig. 1G and H). None of the other SNPs in this 
region were significantly associated after correction for 
the effective number of SNPs. The best SNP did not appear 
to demonstrate transferability at this locus. 

KCNQ1. Although the regional associations at the KCNQ1
locus did not reveal significance after correction for the
effective number of SNPs, several SNPs in two regions
near the index SNPs showed nominal associations. At
KCNQ1 intron 15, the strongest associations were observed
at the index SNP rs2237892 (OR 1.25 [95% CI 1.09-1.43];
P = 0.0018; P_emp = 0.22) and rs2283228 (1.23 [1.08-1.40];
P = 0.0017; P_emp = 0.21), which were highly correlated with
each other (r² = 0.91 in AfA, 0.86 in ASN, 1 in CEU, and
0.92 in YRI) (Fig. 1I and J and Table 3). At KCNQ1 intron
11, the best SNP was rs231361 (1.17 [1.07-1.28]; P = 6.64 ×
10⁻⁴; P_emp = 0.082). This SNP was in weak LD (r² = 0.13 in
AfA, 0.40 in CEU, and 0.09 in YRI) (Fig. 1K and L and Table 3)
with the nonsignificant index SNP rs231362. The effect of
rs231361 was weaker after conditioning on the index SNP
alone (1.13 [1.02-1.25]; P = 0.018), and on both the index
SNP and a surrogate SNP rs2283202 (r² = 0.5 to rs231362 in
CEU) (1.15 [0.98-1.34]; P = 0.086) (Supplementary Table
2). Haplotype analysis of the index and best SNPs revealed 
three common haplotypes (AG, GA, and GG). As compared 
with the AG haplotype, the GA haplotype formed by the 
risk G allele at rs231362 and the risk A allele at rs231361 
was associated with increased risk for T2D (OR 1.16; P = 
0.027), but the GG haplotype did not reveal significant 
association (1.02; P = 0.767). The GA haplotype was also 
associated with increased T2D risk when compared with 
the GG haplotype (P = 0.013) (Table 4). The associations 
at the best SNPs in intron 11 (rs231361) and intron 15 
(rs2237892) remained significant after conditioning on 
each other (P = 0.003 and 0.012, respectively), suggesting 
independent associations. 

Population differentiation and natural selection at 
index SNPs. When comparing allele frequencies of the 
risk alleles at index SNPs in the respective European or 
East Asian populations, the absolute difference in risk al- 
lele frequency varied widely in our AfA samples from 0.01 
at HNF1A to 0.58 at PRC1, regardless of whether the index 
SNPs were associated with T2D in AfA (Supplementary 
Table 3). Using the YRI population as a surrogate for AfA 
in this study, F_ST values at the index SNPs between CEU
and YRI, or ASN and YRI populations, were highly signif- 
icant at only one locus, PRC1 (Supplementary Table 2), 
suggesting modest population differentiation. Three index 
SNPs at NOTCH2-ADAM30, HMGA2, and FTO showed 
significant integrated haplotype scores, suggesting recent 
positive selection (Supplementary Table 2). Finally, we 
performed varLD to assess for differential LD around the 
index SNPs. Four loci at BCL11A, IRS1, DGKB/TMEM195,
and PRC1, and one locus at PTPRD, demonstrated signif- 
icant differences in LD between YRI and CEU or ASN 
populations, respectively (data not shown). 



DISCUSSION 

We found that among 41 independent T2D-associated in- 
dex SNPs, only seven in TCF7L2, KLF14, KCNQ1, CDKAL1, 
JAZF1, ADCY5, and GCKR were significantly associated 
with T2D in AfA. The index SNPs in ADCY5 and GCKR 
were initially identified for strong association with fasting 
glucose levels in European populations (13), suggesting 
that genes regulating glucose homeostasis may also affect
T2D susceptibility in AfA. The lack of replication for most
index SNPs may be partly due to insufficient power, as 16
and 28 SNPs have <50 and <80% power to detect association
for the previously reported effect sizes at an α level of
0.05, respectively (Supplementary Table 1). In addition,
winner's curse may have led to overestimation of the true
population effect size in the prior GWAS. Only seven index SNPs in AfA had
an effect size greater than or equal to that reported in
European and Asian populations (Supplementary Fig. 1).
The lower effects in AfA further lower the power to detect
the associations.

TABLE 4

Haplotype analyses of index and best SNPs for association with T2D in AfA

Reported loci   SNPs¶            Haplotype†   Frequency   OR (95% CI)        P‡
KLF14§          rs13234269(b)    AA           0.10        Reference
                rs972283(i)      AG*          0.10        0.99 (0.83-1.18)   0.898
                                 T*G*         0.78        1.22 (1.08-1.39)   0.002
HMGA2           rs12049974(b)    AC*          0.39        Reference
                rs1531343(i)     T*G          0.42        1.16 (1.06-1.27)   0.002
                                 AG           0.18        0.86 (0.76-0.98)   0.024
KCNQ1||         rs231362(i)      AG           0.18        Reference
                rs231361(b)      G*A*         0.38        1.16 (1.02-1.32)   0.027
                                 G*G          0.44        1.02 (0.90-1.15)   0.767

¶b, best SNP in this study; i, reported index SNP. †Risk allele is denoted by asterisk. ‡Significant associations (P < 0.05) are set in boldface
type. §KLF14: AG (reference) vs. TG haplotypes; OR (95% CI) 1.44 (1.14-1.81); P = 0.002. ||KCNQ1: GG (reference) vs. GA haplotypes; OR
(95% CI) 1.13 (1.03-1.25); P = 0.013.

Several index SNPs showed evidence of recent
positive selection (e.g., NOTCH2-ADAM30 rs10923931,
HMGA2 rs1531343, and FTO rs8050136), were rare (e.g.,
PPARG rs1801282 and CENTD2 rs1552224), or revealed
considerable differences in risk allele frequencies and
population differentiation (e.g., PRC1 rs8042680) in our
AfA samples as compared with the European and Asian
populations. These factors may also impact the direct
replication of prior associations in AfA. Our findings of
limited loci showing population differentiation or under
selection did not provide support of the thrifty genotype
hypothesis, consistent with a study on 17 T2D loci (45).
Interestingly, a locus-wide study of 16 T2D loci in world- 
wide populations revealed many moderately differentiated 
loci in sub-Saharan Africans (e.g., TCF7L2, KCNJ11, 
IGF2BP2, and SLC30A8) and several highly differentiated 
loci in East Asians (e.g., HHEX, THADA, and FTO) that 
contribute to the global differentiation pattern (46). Recently,
Chen et al. (47) reported that risk alleles at 12 T2D
loci showed high F_ST values, as well as a trend of decreasing
frequencies from Africa through Europe to East
Asia. Our study focused on the index SNP comparison 
between Africans and Europeans/Asians, whereas the lat- 
ter two studies examined global differentiation either by 
locus or by index SNPs. The capture of causal rather than 
tagging variants in locus-wide analysis and multiethnic 
comparisons will likely reveal more loci undergoing dif- 
ferentiation and selection. 

Differences in genetic architecture likely affect the pat- 
tern of associations, and lower degrees of LD in AfA may 
facilitate fine mapping of causal variants in loci shared by 
AfA and other populations (28,48). In the locus-specific 
analysis, our sample size had 80% power to detect an OR of 
at least 1.19 for risk allele frequency >0.2 at an α level of 5 ×
10⁻⁴ (corrected for average effective number of SNPs).
Our results demonstrated that the best SNPs in TCF7L2, 
KLF14, and HMGA2 were close to or the same as the in- 
dex SNPs, and the associations remained significant after 
correction for multiple comparisons. In addition to KCNQ1, 
we were able to fine map these association signals by com- 
paring LD patterns and analyzing haplotypes formed by the 
index and best SNPs. 

The association at the TCF7L2 index SNP rs7903146 
approached genome-wide significance and was the best 
signal within the locus and among the reported T2D loci, 
consistent with prior GWAS showing that this SNP was 
one of the most significant signals in several populations 
(12,17,20). Indeed, rs7903146 was also the strongest SNP 
for the present GWAS, and in one of the CARe plus cohorts 
(WFSM) reported recently (31). Of note, rs7903146 was 
located in a 9-kb LD block in AfA and was weakly corre- 
lated with neighboring SNPs that were not significantly 
associated after adjustment for the effect at rs7903146. In 
contrast, rs7903146 resided in a large, 64-kb LD block in
Europeans and was strongly correlated with a set of
different nearby SNPs (Fig. 1A and B). The differential LD
pattern suggests that the risk T allele of rs7903146 is located
on different haplotypes in AfA and Europeans. The
differential association suggests that rs7903146 was the
only SNP showing highly significant association in both
populations (Fig. 1A and B). Taken together, T2D association
at TCF7L2 was transferable to multiple populations,
including AfA, and rs7903146 is likely the causal variant, as 
suggested by a recent resequencing study (49), or it may 
share the same haplotype with the causal variant across 
different populations. 

At KCNQ1 intron 15, rs2237892 was the index SNP 
identified in East Asians and the best SNP in AfA. Simi- 
larly, the index and best SNPs at KLF14 were highly cor- 
related in Europeans. In both cases, the best SNPs in AfA 
were correlated with the same set of SNPs as the index 
SNPs in East Asians and Europeans, respectively, but at 
a reduced LD. This suggests that the best and index SNPs 
may capture a shared causal variant on the same haplo- 
type in these populations. At KCNQ1, rs2237892 and 
rs2283228 were highly correlated in several populations, 
including Europeans, Asians, Africans, and AfA, so the two 
signals were indistinguishable. However, the reduced LD 
and the absence of association in other nearby SNPs in 
AfA suggest that those are not the causal variants. At 
KLF14, significant association was only observed for the 
TG haplotype carrying risk alleles of both the best and 
index SNPs, but not for the AG haplotype carrying only 
risk allele at the index SNP, suggesting that the causal 
variant may be located closer to the 5' end of KLF14 as 
originally reported (12) and likely resides on the TG haplotype
shared across the studied populations.

In contrast, the correlations of the best SNPs in AfA and 
the index SNPs in Europeans at KCNQ1 intron 11 and 
HMGA2 were relatively weak (r² <0.5) in both populations.
At KCNQ1, the best and index SNPs shared
moderate correlations with some nearby SNPs (Fig. 1K
and L). Haplotype analyses suggest that AfA and Europeans
may share the same causal variant on the GA haplotype
formed by risk alleles of both SNPs. The scenario at
HMGA2 is more complex. Haplotype analyses showed that 
the TG haplotype (frequency = 0.42) was at risk for T2D, 
whereas the AG haplotype (frequency = 0.18) was pro- 
tective for T2D in AfA when compared with the AC hap- 
lotype (frequency = 0.39). The result of the TG haplotype 
being at risk was consistent with that of single SNP asso- 
ciations where the T allele of the best SNP and the G allele 
of the index SNP increased risk for T2D in AfA. Note that 
the AC haplotype was also at risk for T2D when compared 
with the AG haplotype. The stronger risk effect in the TG 
haplotype than in the AC haplotype explains the spurious 
opposite direction of association at the index SNP with or 
without conditioning on the best SNP. The respective 
haplotype frequencies are substantially different in CEU 
(0.04 for TG, 0.86 for AG, and 0.10 for AC). The TG hap- 
lotype is rare in Europeans. The AC haplotype is likely at 
risk for T2D as compared to the AG haplotype, leading to 
the observation of the index SNP C allele being at risk in 
Europeans. Recently, a multiethnic, gene-centric study 
revealed that rs9668162 at HMGA2 was associated with 
T2D risk in AfA (27). Rs9668162 was moderately correlated
(r² = 0.41) with the best SNP rs12049974 but weakly
correlated (r² = 0.18) with the European index SNP
rs1531343 in our AfA samples, supporting our finding of
an independent signal at HMGA2 in AfA. Together, this
suggests either allelic heterogeneity with different causal
variants residing on different haplotypes in different populations,
or a common causal variant residing on multiple
haplotypes at different frequencies shared across populations.

To our knowledge, this is the first comprehensive fine- 
mapping study of reported T2D loci in AfA. We found that 
only 8 out of 40 loci at TCF7L2, KLF14, KCNQ1, ADCY5, 
CDKAL1, JAZF1, GCKR, and HMGA2 were transferable to 
AfA with significant associations at the index or nearby 
SNPs. It should be noted that the magnitudes of association
vary dramatically from strong association (TCF7L2)
to very nominal evidence, e.g., GCKR and JAZF1. The lack 
of association is likely due to limitations in study power, 
population differentiation, and/or differential LD. Addi- 
tional genetic variants, likely yet to be discovered, will 
unravel the high prevalence of T2D in AfA populations. 
Importantly, the reduced and differential LD patterns in 
AfA at the significant loci support the fine mapping of 
regions of association in prior reports. Subsequent studies, 
including higher-density imputation to the 1000 Genomes reference panel,
trans-ethnic meta-analysis at loci demonstrating pop- 
ulation variation in LD structure, and functional studies, 
will be valuable for localizing causal variants and con- 
firming these findings. 